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SET REDUCTION IN NONLINEAR EQUATIONS 



ERHAN TURAN AND ALI ECDER 



Abstract. In this paper, an idea to solve nonlinear equations is presented. 
During the solution of any problem with Newton's Method, it might happen 
that some of the unknowns satisfy the convergence criteria where the others 
fail. The convergence happens only when all variables reach to the convergence 
limit. A method to reduce the dimension of the overall system by excluding 
some of the unknowns that satisfy an intermediate tolerance is introduced. In 
this approach, a smaller system is solved in less amount of time and already 
established local solutions are preserved and kept as constants while the other 
variables that belong to the "set" will be relaxed. To realize the idea, an 
algorithm is given that utilizes applications of pointers to reduce and evaluate 
the sets. Matrix- free Newton-Krylov Techniques are used on a test problem 
and it is shown that proposed idea improves the overall convergence. 



1. Introduction 



Newton's Method (NM) is frequently used to solve systems of nonlinear equa- 
tions. This method starts with an initial guess, x° and after k updates, the iterate 

I/*-) ■ x fc satisfies the Equation Q] here f denotes discretized form of the held equations. 

^^ | To find an update, the linear system stated in Equation [2] should be solved at each 

step, J being the Jacobian Matrix such that J^ — 4^-. One Newton step is com- 

CO . pleted as current estimate of the solution vector is updated with Equation [3] where 

A acts as a damping parameter. 



(1) f(x)=0 

(2) JAx = -f 



X 

(3) x fe = x*- 1 + AAx 

Main advantage of the method is that the computations converge quadratically if 
the estimate is close to the exact solution. However, there are some downsides of the 
method. First is the selection of the initial guess. A good initial guess can lead to 
convergence in a couple of steps. To improve the initial guess, several ideas can be 
tried [8] . As a starter, the problem can be solved on a coarser mesh first and then the 
solution can be interpolated on to the fine grid. One can also perform continuation 
via the physical parameters of the problem, like the Reynolds number, Re, in CFD. 
Another idea to enhance the estimate is to use pseudo-transient algorithms at which 
the problem is allowed to advance in time at which the steady state will yield the 
solution of the discrete problem. 
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Another issue in NM is the solution of Equation [2] Two things should be con- 
sidered. The Jacobian, J should be formed at each step and proper linear solvers 
should be applied to calculate the update Ax. Formation of the Jacobian is not 
trivial to accomplish since the matrix is generally large and sparse and analytic 
representation of matrix is not available. Color-based evaluation techniques [5 or 
Automatic Differentiation [6 can be used for efficient computation of the Jacobian. 
Still, storage of the entries is the main concern and appropriate sparse storage 
schemes should be used [TO] . When Newton-Krylov Methods are utilized, than the 
formation of the Jacobian could be avoided (unless preconditioner is needed) and 
the matrix- vector products can be calculated with directional differencing pQ. 

One final concern is the global convergence. If the estimate is not close to the 
exact solution, i.e. when the computation starts from rather an uneducated guess 
- damping is required to calculate the solution. Using a constant A such that 
< A < 1 to damp the update is useful to improve the convergence. On the other 
hand, selection of the damping parameter should be performed with line search or 
trust region methods [3] so that full steps can be used as the iterates approach the 
exact solution. 

In the literature, several studies are performed on Newton's Method. This paper 
focuses on speeding up the computations by reducing the solution set. In the 
solution domain, it might happen that some points do not satisfy the convergence 
criteria as fast as the rest because of the local nonlinearities. These problematic 
nodes lag the convergence, in other words, the method continues until all unknowns 
satisfy the tolerance. In the meantime, already converged points are kept in the 
solution set. The idea is to isolate those points with high local residual and solve 
them alone. Hence the object of this study to analyze the performance of the 
subsets created while solving Equation [TJ 

This paper is organized as follows. In section [2j the set idea is presented and 
various solution strategies are discussed. Section [3] states the algorithm and gives 
insight about the implementation. In section HJ numerical techniques used in the 
study are explained and the performance of the presented idea is demonstrated on 
a test problem. Last section is devoted for conclusion and remarks for future work. 

2. Set Idea 

While solving a nonlinear system, the convergence of some of the unknowns 
might be different than others. These unknowns might be a member of a different 
equation or they might be located on challenging parts of the domain. In CFD for 
instance, continuity equation converges slower compared to momentum equations 
and yet the computations stop only if the mass conservation is satisfied. From this 
observation, one is can ask the following question: can the solution set be restricted 
into a subset such that the solver concentrates just on those gradually converging 
components? In set idea, the subset it is not restricted with just one equation 
or parts of the solution domain. All of the unknowns are handled at the same 
time and a subset should be selected which in turn reduces computational time but 
converges as before. Such a methodology requires two things, a criteria to decide 
on the subset and a data structure that can handle the changes on the solution set. 

Before proceeding, a few remarks on Newton's Method (NM) should be stated. 
When NM is tested for convergence, what is generally used as a criteria is the norm 
of vector, ||f || 2 . However, this norm is just a representative scalar which shows how 
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well the iterate, x fe satisfies Equation [TJ It is possible that for a not converged 
problem, some of the points are already satisfactory in terms of the convergence 
criteria and yet they are kept in the system until the solution is found. This 
observation reveals one thing; the solution might be performed on a smaller set as 
soon as some of the points satisfy the convergence criteria. Hence, the set idea is 
merely simple: Select some points on the solution domain with a "set rule" and 
perform computations only for the unknowns related to those points while keeping 
the others fixed i.e. freezed. The intention is isolate the points above the RMS 
limit and to solve only those points. Normally, it might be useful to use different 
Aj's for each Xi, so the nonlinearity can be controlled locally, however, selection 
of damping parameters one-by-one is a tedious task and the selection criterion is 
not obvious. Instead, points with similar behavior can be grouped together and 
handled separately. 

In the proposed idea, a Global set, denoted by Sq includes all of the unknowns 
of the problem. This is the set that the answer is looked for. A local set, Sl, on the 
other hand, is a subset of Sg which is determined in run-time. An obvious outcome 
is that Sl Q Sg- When both sets are equivalent, then Newton's Method performs 
as usual. If, however, a subset is generated than Equation [2] is solved for a smaller 
problem. 

In [3] , a special case of the proposed idea is presented. The so called Nonlinear 
Elimination is used to eliminate some of the unknowns-just to concentrate only on 
the harder part of the model. What is meant with hard is that parts of the domain 
are involved with high nonlinearities with degrades the efficiency of the solution 
process. To get rid of the difficulties, it is suggested to solve that particular domain 
with high accuracy while keeping other values fixed. As a next step, the unknowns 
are solved all together in such a way that the initial guess is improved by keeping the 
solution on the non-eliminated domain intact. What is missing in this methodology 
is the automation of the eliminated points. Eliminated nodal values are defined a 
priori. 

To decide on a set, three different methods are suggested to be used as a rule. 
Residual based criteria is stated in Equation [4] a < 1 is an additional parameter 
to control the tolerance. A small a will force a stricter selection criterion. As seen 
in Equation , Root-mean-square (RMS) is used to examine local residuals. Use of 
RMS is more appropriate to perform a comparison with local residuals, /^'s since 
the nonlinear norm, ||f || 2 is the sum of all local norms and it is obviously larger 
than all of the residuals. 

(A\ if J fi <a ll f HfflV/S' fi t S L 

( ' J I fi>oc\\i\\ RMS , heS L 

One other idea might be to compare local residual against the average of local 
residuals. This idea would be especially useful for problems with more than one 
local degree-of- freedom (dof). In multi-dof problems, dof can be compared against 
the averages of that particular dof itself instead of using an average of all unknowns. 



(5) */ ^ 



N 
N 

h>a\ Efj ) / N , fi£S L 
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Yet another idea is to compare the size of the unknown x% against the step size 
Axi. It is natural that the computations should continue if the update is larger 
than the value that it is applied to. To proceed one step further, one should keep 
the unknown in the set if the update is larger even if the two previous selection 
criteria are satisfied. 



(G) 



if 



I Ax,; I < \Xi 



fi £ s L 



\A Xi \ > \xi\ , fi&S L 



Note that first two criteria still work with constant values, i.e. the operation is 
identical at each point. Last criterion is, on the other hand, unique for each dof. 

In numerical tests, it is observed that residual based decision criteria might not 
improve the solution. The solution completes as Sg at its best if a strict criterion 
i.e. a small a is used. A looser criterion (a = 1) might need more Newton iterations 
as well as more linear iterations per Newton step. Also, the convergence history is 
not smooth as the global problem, though for shorter run-times one can sacrifice on 
that, and yet sample computations also take more time as demonstrated in section 

® 

Since the residual alone does not reflect the actual stand of the nonlinear problem 
well, the update should also be considered. As a matter of fact, the magnitude of 
the update should be compared with the solution as in the third rule. This condition 
is also used in Sg and essential to capture the physics [TTj . 

Major difference with the Nonlinear elimination is that the local set is defined 
during run time. Also it is not fixed, it can adapt itself as the solution proceeds 
with subsequent Newton iterations. Another distinction with Nonlinear elimination 
is that an initial attempt is made to calculate the first update vector Ace. The 
decision on Sl is made after this first step. Hence, first it is worked on global set 
and decide on the local set. Nonlinear elimination, on the other hand starts with a 
predetermined local set. Later, this information is used to improve the initial guess 
on the global set. 

When the initial norm is computed, the nonlinear residual is calculated. That 
means, a guess for the subset could be made with this piece of information. Unfortu- 
nately, initial estimate might lack essential information regarding to the physics (if 
for example a zero initial vector is used, convective terms in Navicr-Stokcs equations 
are ignored which are the source of nonlinearity at the first place!). Additionally, 
it is already suggested that magnitude of step sizes are also useful to control the 
subset. Ax, however, is set to be zero as a rule of thumb in Newton's Method so 
there is no information regarding the magnitudes of local updates beforehand. 
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Figure 1. Demonstration of the set idea 
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Consider Figure [TJ This is an example grid to demonstrate the set idea. Assume 
that the points marked with black circles are selected to be the local set. A index 
matrix is used to store the indices of the unknowns. The matrix structure shown in 
the figure has three layers. First layer is the absolute layer which keeps the indices 
of the global set. The second layer is the relative layer which stores the relative 
indices of the unknowns of Sl- Last layer, depicted as the flag layer is actually a 
layer of reference to construct the local set. At the beginning, all of the entries of 
this layer is 1 meaning that all of the variables are part of the local set. Depending 
on the set rule, either or 1 assigned for each entry. While scanning the domains 
to construct the set, point with flag values are ignored. Examining the figure, 
it can be seen that all rows are not fully covered, even j ' — 4 has no points in the 
set. In this model, size(5i) = 11 and size(<Sc) = 25. Therefore, f should return a 
vector with 11 elements only. 

3. Implementation 

When a new discretized problem is implemented into the solver, the model is 
presented for all of the unknowns. Adding special cases with if or switch clauses 
will slow down the computations. On the other hand, set idea seems to work 
only with if statements where the indices are checked over the flag layer. In this 
study, another approach is employed to evaluate f considering the local functions, 
/j's in the set. In order to understand the implementation of the set idea, it is 
constructive to advance step by step. To start, the unknowns have to be organized. 
The inventory is presented in Table [TJ i indices are given for each j with their 
respective absolute and relative indices. Absolute index is for the identification of 
the node on Sg and relative index is for Sl- One imminent outcome is that i's are 
actually members of j's. For a given row, the columns can be stored in vectors with 
varying lengths. The respective tree structure is given in Figure [2j As observed 

Table 1 . Index inventory of Figure [TJ 
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Figure 2. Set tree 
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from [21 for each j, an i vector is stored. Hence, instead of looping over i and j's, 
the cycles will be based on the vectors. This model is demonstrated in Table [2] In 
this code segment, jVec is a vector that stores the indices of the j that belong to 
the set. j 's on the other hand are derived types that keeps the indices of the i's per 
j in iVec. When Figure [2] is examined, it can be seen that the lengths of the iVec 
do differ. This is considered algorithmically while forming the local set. Local set, 
Sl, actually refers to these vectors that forms the tree-structure. With the use of 

Table 2. Classic loop vs. Set loop 



! classic loop style 
do j=2,n-l 

do 1=2, n-1 

end do 
end do 

! set loop style 

do myJ=l,size(domain7,jVec) ! access to jVec 

j =domain*/ j Vec (myj) 

do myI=l,size(domain'/,j (myJ)'/,iVec) 

i=domain'/ j (myJ)7,iVec(myI) ! access to iVec of j(mjJ) 

end do 
end do 



this idea, no if statement will be necessary. While solving Equation [2j the local 
set will be fixed and the computations will be based on Table [5] Note that, when 
the linear solver is used, it will operate on a smaller Ax whose size is 11 for this 
example model. 

While using the subsets, the solution vector x is not regenerated. Updates on 
the solution vector are applied regardless of the use sets. The key point is to use a 
vector, which is called setVec, that stores the absolute indices of the unknowns in 
S L . The vector for the example is setVec=[l, 2, 6, 7, 8,9, 13, 14, 22, 23, 24]. Thus, 
when the update is applied onto the solution following assignment in Equation [7] is 
used. When the solver works on the global set, then setVec=l : 25 so the original 
system is recovered. 

(7) x(setVec) = x(setVec) + AAx 

Now, the algorithm can be presented as in Table [3] At step 1, Newton's Method 
is applied on Sg- By default, only one iteration is performed to calculate the 
update, Ax and to use a more realistic f . Later in the second phase, the unknowns 
are tested against the set rules defined in section O New subset is solved in step 
4 and the update is applied on x using setVec in step 5. Last step is to check 
whether the solution satisfies Equation [1] 

In step 4, the problem is solved until the solution is achieved for the local set. A 
variant of the algorithm above can be used at which one avoids the accurate solution 
of the local set and performs just one Newton iteration. This algorithm actually 
performs just a single series of Newton Method where the size of the problem varies 
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Table 3. Set Algorithm 



(1) Perform a computation on Sg 

(2) Apply the set rule and modify the flag layer. 

(3) Form Sl and setVec 

(4) Solve S L 

(5) Update x on Sl and test f on Sq 

(6) If converged stop; else goto 1 



from one step to another step. This variant is also tested in this study. Before 
completing this section, it should be note that the set idea resembles the block- 
Gauss Seidel method. What is different is that the elements of the block have a 
diverse pattern on the domain. Also note that, an improvement on the idea is to 
use the sets as the Newton's Method proceeds. 

4. Results and Discussion 

For the solution of systems of nonlinear equations, Newton-GMRES is used. 
Inexact Newton Method [2 is applied with adaptive forcing parameter [3] . Matrix- 
vector products in GMRES are performed using the matrix-free methodology as 
in Equation |5J Selection of the perturbation parameter e is based on the idea 
presented in [T] . Combination of absolute and relative tolerances are used to define 
the nonlinear convergence tolerance as suggested in [7] with T a i, s = r re i = IE — 8. 
Line search backtracking is used to damp the updates [3]. Maximum number of 
nonlinear iterations is 20. 

(8) Jv « f ( X + ^- f W 

£ 

4.1. Test Problem. Test problem is taken from [!J| which reads as in Equation lijfl 
with boundary conditions u(0) = u(l) = 0. The exact solution is given in Equation 

ina 

(9) - u" + u 3 + Ux 10 8 (a: - 0.5) 2 - 2 x 10 4 ) u - 10 9 e _3 (^ra i ) 2 = 

(10) u(x) = 10 3 e~( ; Trar L ) 2 

The solution of the equation has a peak around x = 0.5 which complicates the solu- 
tion. In [9], the set is determined before solving the problem: For a 100 node grid, 
the initial set is fixed as nodes 49-52. In current study, these points are automati- 
cally determined using set rules. The solution of the problem is given in Figure [3] 
In Figure [4j Newton's Method is compared with the solution based on the set idea. 
Excluding the initial norms, Newton's Method converges in 7 steps whereas the set 
idea uses only 2 global iterations. In the sub solves of the set approach (i.e. step 
4 in Table [3]) 6 and 3 Newton steps are performed, respectively. Newton's Method 
required 34 total linear iterations where the set idea needs 33 linear iterations. Al- 
though, set method is 1 linear iteration shy, more linear iterations per nonlinear 



The authors believe there are typos in the equation given in the reference. The exact solution 
does not satisfy the ODE unless it is corrected as in Equation [9] 
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Figure 3. Solution of the test problem 

iteration is needed compared to Newton's method (16.5 vs. 4.9). In the first step 
of the set idea, the local set is identical to those points used in [9] , only 4 points 
(49:52) are kept in Sl- In the second iteration, however, the local set is expanded to 
12 points (45:56). In both cases, the local set is around x = 0.5 which makes sense 
because of the spike in the problem. This comparison is based by using average of 



io 1 V 


















— * — 


Set- Newton 








- ». 


— — - 


Newton 


10 s 








\ 

\ 




10 6 








\ 
\ 




I10 4 

£ 

a 
a> 

f 10* 

c 








\ 

\ 
\ 
\ 
\ 
1 
\ 




10° 








\ 
\ 
\ 




10 2 








\ 




10- 














2 


3 


4 5 6 7 8 


9 10 








# of iterations 





Figure 4. Comparison of the set idea with Newton's Method 
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the nonlinear residuals with a = 0.01. When a — 1, then set still converges in two 
steps but requires 35 total linear iterations. Use of RMS of f performs the same as 
the average values. Use of the step size criterion did not result a reduction in the 
set and the computations are performed as Newton's Method. When the number 
of unknowns is increased, it can be observed that the set idea requires more linear 
iterations, however, the computations are to be performed in less amount of time. 
In Table |H the methods are compared on different grid sizes where iters denotes the 
nonlinear iterations and the numbers in the parenthesis are total number of linear 
steps. Set size shows the size of the local set at each set iteration. The sets are 
created using the average values with a = 0.001. The efficiency of the proposed idea 
is apparent over the Newton's method when computational times are considered. 
The results in Table resembles |4] the results in [9] but speedup over the Newton's 
Method is not constant as in the reference. 

Table 4. Comparison in various grid sizes 



grid size 


Newton Time 


Newton Iters 


Set Time 


Set Iters 


Set size 


500 


0.4 s 


11(105) 


0.25 s 


3(137) 


20,36,26 


1000 


0.92 s 


10(159) 


0.51 s 


3(293) 


40,68,59 


5000 


111.17 s 


13(1422) 


17.38 s 


4(2101) 


196,311,287,498 



5. Conclusion 

In this study, an idea is presented to reduce the solution set of a nonlinear 
system. The restriction of the set can be carried out by using different rules that 
are based on the nonlinear residual as well as the step size. A new approach is 
demonstrated to keep track of the unknowns on the local sets and to evaluate the 
nonlinear system. The idea is tried on a test problem and it is observed that use 
of local sets improves the solution. 

This idea can be tested on harder problems in two dimensional geometries and 
with multi-dof systems. The variant of the algorithm can also be used to examine 
the effectiveness of the set idea. The method presented here is used only with finite 
differences. It can be extended to finite volumes and finite elements as well as to 
unstructured grids. A theoretical study was out of the scope of this paper, however, 
a numerical analysis could improve the selection of the sets and the tolerances. 
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